ABSTRACT

The myotonic dystrophies (DM) are human diseases 
in which the accumulation of toxic RNA (CUG or 
CCUG) repeats in the cell causes sequestration of 
splicing factors, including MBNL1, leading to clinical 
symptoms such as muscle wasting and myotonia. 
We previously used Dynamic Combinatorial Chemistry
to identify the first compounds known to inhibit
(CUG)-MBNL1 binding in vitro. We now report transformation
of those compounds into structures with
activity in vivo. Introduction of a benzo[g]quinoline 
substructure previously unknown in the context of 
RNA recognition, as well as other modifications, 
provided several molecules with enhanced binding 
properties, including compounds with strong se- 
lectivity for CUG repeats over CAG repeats or 
CAG-CUG duplex RNA. Compounds readily pene- 
trate cells, and improve luciferase activity in a 
mouse myoblast assay in which enzyme function is 
coupled to a release of nuclear CUG-RNA retention. 
Most importantly, two compounds are able to par- 
tially restore splicing in a mouse model of DM1. 



INTRODUCTION 

High affinity, sequence-selective recognition of RNA by 
synthetic molecules is increasingly recognized as a key 
strategic goal for the production of novel therapeutics 
and biochemical probes (1). The importance of this area 
is accentuated by the ever-increasing pace of discovery of 
new RNA sequences with biochemically important (and 
therefore potentially biomedically important) functions. 
Among the many recent advances in this area is the rec- 
ognition that many non-coding RNA (ncRNA) elements 



present in the eukaryotic RNAome play a direct role in 
controlling cellular processes and disease (2). Human 
diseases believed to have an ncRNA origin include spin- 
ocerebellar ataxia, fragile X-syndrome, diabetes mellitus, 
myoclonus epilepsy and the myotonic dystrophies (DM) 
(3). However, to date only a relatively small number of 
compounds have been reported that bind specific RNA 
sequences and elicit a desired biological response. For 
these reasons, expanding the pool of sequence-selective 
RNA-targeted synthetic molecules presents a critically 
important but under-examined challenge in chemical 
biology. 

The DM are central examples of a growing family of 
RNA-mediated diseases (3-5). Myotonic dystrophy type 1 
(DM1) is the most common form of adult-onset muscular 
dystrophy, affecting ~1 in 8000 people (6). An autosomal 
dominant inherited disease, DM1 results from a CTG 
repeat expansion (CTG exp ) in the 3'-UTR of the DM 
protein kinase gene (DMPK) on chromosome 19q. The 
expanded CTG is transcribed into long CUG exp repeat 
mRNA. These RNA repeats sequester RNA-binding 
proteins such as the MBNL (muscleblind-like) family of 
splicing regulators, retaining them in the nucleus as foci. 
This in turn leads to misregulated alternative splicing, or 
spliceopathy. Myotonic dystrophy type 2 (DM2) is caused 
by an unstable expansion of a CCTG repeat in intron 1 of 
the cellular nucleic acid-binding protein gene (CNBP, also 
known as the zinc finger protein 9 gene or ZNF9) on 
chromosome 3q. Transcription produces toxic mRNA 
containing hundreds to thousands of CCUG exp . Like 
CUG exp , these are also sequestered into foci, and deplete 
MBNL1 protein from the affected cell (7,8). Currently, 
there is no pharmaceutical therapy for either DM1 or 
DM2. However, the molecular understanding of these 
diseases suggests that displacement of MBNL1 from its 
CUG exp or CCUG exp binding sites constitutes an attract- 
ive strategy for developing therapies targeting DM (8,9). 
Experiments in animal models support this hypothesis.
For example, a morpholino antisense oligonucleotide 
(CAG-25) complementary to CUG exp RNA was able to 
hybridize with the CUG exp RNA in vitro, and displace 
MBNL1. Intramuscular injection of CAG-25 in a trans- 
genic DM1 mouse model partially restored chloride 
channel-1 (Clcn-1) protein expression, and DM1
symptoms lessened (10). The high cost and challenging 
pharmacological properties of oligonucleotide-based 
drugs suggest, however, that alternative approaches to tar- 
geting CUG exp RNA are of value. 

In the absence of well-defined rules guiding the design of 
sequence-selective RNA-binding molecules [such as those 
developed by Dervan et al. (11) for DNA recognition], the
question arises as to how one successfully addresses the 
binding problem. Several years ago, a number of groups, 
including ours, developed the concept of Dynamic 
Combinatorial Chemistry (DCC) as a 'rapid prototyping' 
method for testing binding hypotheses and facilitating the 
identification of compounds with novel architectures and 
properties (12). Since that time, DCC has evolved in many 
directions, demonstrating its potential as a method for the 
identification of receptors for small-molecule analytes, 
catalysts, new materials, sensors and a broad range of 
compounds able to bind protein and nucleic acid targets. 
However, to our knowledge no DCC-derived hit 
compound has either directly or through analog produc- 
tion resulted in a structure with in vivo activity. 

In 2008, we reported the first non-nucleic acid-based 
compounds (1 is a representative structure) capable of 
binding CUG exp RNA and competitively inhibiting 
CUG exp-MBNL1 binding in vitro (13). This work relied
on a resin-bound form of DCC, termed RBDCC, that we 
developed to facilitate the identification of sequence- 
selective DNA (14) and RNA (13) binding compounds. 
Several groups have subsequently demonstrated elegant 
and structurally varied approaches to binding CUG exp 
and CCUG exp RNA (8,15-19). This recent upsurge of 
interest highlights the fact that DM1 and DM2 RNAs 
are important therapeutic targets, as well as valuable 
model systems for testing hypotheses regarding the 
factors influencing selectivity and affinity in RNA recog- 
nition. Despite these advances, demonstration of the res- 
toration of MBNL1 activity in vivo by cell-permeable, 
highly selective CUG exp RNA binders remains an import- 
ant goal. 

RBDCC hit compound 1 (Figure 1) and related mol- 
ecules identified in our initial work provided a useful dem- 
onstration of feasibility, and set the stage for building 
toward a compound that would be suitable for further 
evaluation in the biological context. To accomplish that 
goal, we anticipated that replacing the disulfide bridge 
with an olefin bioisostere would not have a dramatic 
impact on affinity, based on results from parallel efforts 
in our lab targeting an RNA sequence involved in 
regulating -1 ribosomal frameshifting in HIV (20).
Since disulfides are easily reduced in the cytoplasm, 
replacing the disulfide with an olefin or alkane would fa- 
cilitate cellular studies. Second, molecules containing 
hydrocarbon bridges of varied length would allow us to 
examine the effect of linker length and configuration on 



binding ability and selectivity. Third, we wished to expli- 
citly examine the importance of the amino acid sequence 
order. Finally, as quinolines are known intercalators, at 
least in the DNA-binding context (21), we hypothesized 
that increasing the pi surface area of this group would 
enhance affinity. In this regard, we were surprised to 
discover that despite the vast amount of research con- 
ducted into the nucleic acid recognition properties and 
biological activity of acridine derivatives, including the 
use of several acridines in humans as antimicrobials (22) 
and chemotherapeutic agents (23), we are only aware of 
one mention of the closely related benzo[g]quinoline 
heterocycle (i.e. 2, Figure 1) in the nucleic acid recognition 
literature (24). Thus, synthesizing and testing derivatives 
incorporating this moiety would constitute the first exam- 
ination of this heterocycle in the RNA binding context. 

MATERIALS AND METHODS 

Benzo[g]quinoline 2 was synthesized by condensation of 
methyl acetoacetate and 3-nitro-2-naphthaldehyde (25), 
using a one-pot procedure originally developed by us in 
the context of quinoline synthesis (26). Compounds 3-9 
were synthesized on solid phase by analogy to methods 
previously reported by our group (20). For compounds 
10 and 11, L-pentenyl glycine was synthesized via asym- 
metric alkylation of pseudoephedrine glycinamide (27). 
Complete synthesis procedures and compound character- 
ization are provided in Supplementary Data. 

Surface plasmon resonance (SPR) analysis 

All SPR experiments were conducted using a Biacore-X 
instrument (Biacore, Inc.). Both flow cells (FC1 and FC2) 
of a research grade carboxymethyl dextran coated sen- 
sor chip (CM5, GE Healthcare) were functionalized 
with streptavidin following activation by EDC/NHS. 
Unreacted NHS-ester was deactivated with ethanolamine. 
Next, a known amount (response unit, RU, between 
200-1000) of 5'-biotin labeled RNA (IDT Inc.) in 
running buffer was captured in FC2; FC1 was blocked 
with biotin and used as a reference. The level of RNA 
immobilized was limited to reduce mass transfer effects 
on the association phase. Binding analyses were carried 
out by flowing various concentrations of compound in 
HBS-N buffer (0.01 M HEPES, 0.15 M NaCl, pH = 7.4) 
at 30 or 60 ul/min over the immobilized RNA and record- 
ing the reference-subtracted sensorgrams. For kinetic 
experiments, dissociation was monitored for at least 
200 s after which the surface was regenerated, where ne- 
cessary, with a pulse of either 0.5 or 1 M NaCl, followed 
by buffer wash to reestablish baseline. Each compound 
concentration was injected twice to verify consistency. 
Blank subtraction was performed by subtracting the result 
obtained from the injection of buffer alone. Equilibrium 
binding constants (K_D) and kinetic rate constants
(k_on, k_off) were obtained by globally fitting sensorgrams
to the 1:1 Langmuir equation using Biaevaluation 
software (Biacore, Inc.). Use of the 1:1 Langmuir 
equation assumes that each binding site on the RNA is 
identical; this is of course an oversimplification. Reported 
apparent K_D values are an average of two separate
SPR experiments on different sensor chips, with dif- 
ferent surface densities of immobilized RNA to verify 
that measured values were not impacted by surface 
RNA density. Dissociative half-lives t_1/2 were
calculated from the dissociation rate constants k_d
(t_1/2 = ln 2 / k_d). Dissociative rates were calculated using a
single-exponential decay. This methodology has been 



proven robust even for complexes with slower off rates 
than those observed in this study, and in cases where the 
SPR signal does not return to baseline within the meas- 
urement time frame (28). In part, this is because 
competing fast processes manifest as changes to early 
time points in the SPR dissociation curve. Stoichiometry 
n was assessed by repeating each SPR experiment with 
increasingly higher ligand concentration to saturation 
(RU_max), followed by Scatchard analysis (r/C_free versus r)
of the steady-state RUs. All error measurements were 
calculated as described in the literature (29,30) (Sup- 
plementary Methods, and Supplementary Tables S1-S44). 
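
The quantities reported from the SPR analysis are related by simple expressions under the 1:1 Langmuir model. The Python sketch below is illustrative only and is not the Biaevaluation fitting procedure; the function names and the example rate constants are assumptions chosen for illustration, not values from this study.

```python
import math


def equilibrium_constant(k_on: float, k_off: float) -> float:
    """K_D (M) from an on-rate (1/(M*s)) and an off-rate (1/s)."""
    return k_off / k_on


def dissociative_half_life(k_off: float) -> float:
    """t_1/2 (s) for single-exponential dissociation: ln 2 / k_d."""
    return math.log(2) / k_off


def steady_state_response(conc: float, k_d_eq: float, ru_max: float) -> float:
    """Steady-state response for identical 1:1 sites: RU_max * C / (K_D + C)."""
    return ru_max * conc / (k_d_eq + conc)


def dissociation_phase(r0: float, k_off: float, t: float) -> float:
    """Response remaining t seconds after the start of dissociation."""
    return r0 * math.exp(-k_off * t)


# Hypothetical rate constants (not taken from the paper).
k_on, k_off = 1.0e5, 4.0e-3
k_d_eq = equilibrium_constant(k_on, k_off)   # 4e-8 M, i.e. 40 nM
half_life = dissociative_half_life(k_off)    # about 173 s
```

Because the 1:1 model treats every binding site on the RNA as identical, values computed this way are apparent constants, as noted above.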

Fluorescence titrations 

RNA (10 uM stock) in HBS-N buffer was titrated into a 
solution of 1 uM compound in HBS-N. After each 
addition of RNA, 10 min were allowed for equilibration
before recording the change in fluorescence at 468.5 nm. 
Fluorescence units (FU) were then corrected for dilution 
and the FU after each addition was subtracted from the 
FU at zero RNA concentration to give ΔFU. Following
saturation, ΔFU was plotted against RNA concentration
using Origin 7 (OriginLab, Inc.), and fitted to a 1:1 
Langmuir binding model to obtain apparent dissociation 
constants (K_D). The reported K_D values are a composite fit from
at least two repetitions. 
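
The data reduction described above (dilution correction, conversion to ΔFU, and a 1:1 Langmuir fit) can be sketched in Python as follows. This is a minimal illustration, not the Origin 7 procedure used in this work. It assumes the simple hyperbolic form ΔFU = ΔFU_max * C / (K_D + C) with the total RNA concentration used as C, and it uses a golden-section search over log K_D as a stand-in for a general nonlinear least-squares routine. All function names and the synthetic data are hypothetical.

```python
import math


def dilution_corrected(fu, v_initial, v_added):
    """Scale a fluorescence reading back to the starting volume."""
    return fu * (v_initial + v_added) / v_initial


def delta_fu(fu_zero, fu_corrected):
    """Quenching signal: FU at zero RNA minus the corrected FU."""
    return fu_zero - fu_corrected


def langmuir(conc, k_d, d_max):
    """1:1 Langmuir isotherm."""
    return d_max * conc / (k_d + conc)


def fit_langmuir(concs, dfus, log_lo=-3.0, log_hi=4.0, iters=200):
    """Least-squares fit of (K_D, dFU_max); K_D is in the units of concs."""

    def best_d_max(k_d):
        # For a fixed K_D the optimal amplitude has a closed form.
        xs = [c / (k_d + c) for c in concs]
        return sum(x * y for x, y in zip(xs, dfus)) / sum(x * x for x in xs)

    def sse(log_kd):
        k_d = 10.0 ** log_kd
        d_max = best_d_max(k_d)
        return sum((y - langmuir(c, k_d, d_max)) ** 2
                   for c, y in zip(concs, dfus))

    # Golden-section search for the minimum of sse over log10(K_D).
    g = (math.sqrt(5) - 1) / 2
    lo, hi = log_lo, log_hi
    x1, x2 = hi - g * (hi - lo), lo + g * (hi - lo)
    f1, f2 = sse(x1), sse(x2)
    for _ in range(iters):
        if f1 < f2:
            hi, x2, f2 = x2, x1, f1
            x1 = hi - g * (hi - lo)
            f1 = sse(x1)
        else:
            lo, x1, f1 = x1, x2, f2
            x2 = lo + g * (hi - lo)
            f2 = sse(x2)
    k_d = 10.0 ** ((lo + hi) / 2)
    return k_d, best_d_max(k_d)


# Synthetic, noise-free example (concentrations in nM; not experimental data).
concs = [10, 25, 50, 100, 200, 400, 800]
dfus = [langmuir(c, 60.0, 500.0) for c in concs]
k_d_fit, d_max_fit = fit_langmuir(concs, dfus)
```

With real titration data, the two or more repetitions would be pooled into one list of points before fitting to obtain the composite value.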

Cell permeation 

C2C12 mouse myoblasts grown to 80% confluence were 
exposed to compounds for 12 h in a 96-well tissue culture 
plate. After removal of the culture media [DMEM con- 
taining 10% FBS, 1% penicillin-streptomycin (GIBCO)], 
the cells were washed twice with PBS to remove excess 
compounds. Cells were then imaged while in buffer 
under a fluorescence microscope (Olympus IX70) in the
96-well plate using a 460-nm emission filter. 

Cell toxicity 

C2C12 mouse myoblasts were plated in a 96-well tissue cul- 
ture plate in DMEM (10% fetal bovine serum, 1% peni- 
cillin-streptomycin) and allowed to grow to ~80% 
confluence at 37°C under CO2. Varying compound concentrations
(up to 1 mM) were incubated with cells
(48 h, 37°C). Media was then removed, and 3-(4,5- 
dimethylthiazol-2-yl)-2,5-diphenyltetrazolium bromide 
(MTT) in media was added to each well and incubated 
at 37°C for 4 h. After removal of the MTT media, isopropanol
(100 ul) was added, and absorbance was measured
at 600 nm on a Modulus microplate reader (Turner 
Biosystems). 

Activity in transformed mouse myoblasts 

C2C12 mouse myoblasts were engineered to stably express 
a firefly luciferase transcript with or without ~800 uninter- 
rupted CUG repeats in an hDMPK 3'-UTR (clones C5-14 
and Cl-S, respectively). Cells were grown in 100 ul growth 
media in 96-well plates to ~10% confluency. Compound 
was added at various concentrations in triplicate to both 
Cl-S and C5-14 cultures, and incubated for 3 days. 
Culture media was replaced with fresh media containing 
1% WST-1 reagent (Roche). After ~15-20min of incuba- 
tion, the WST-1 containing media was transferred to a 
clear 96-well plate, and the absorbances at 450 and 
690 nm measured with a PerkinElmer EnVision Plate
Reader. Cells were gently rinsed (1x PBS) before incubation
at -20°C in 100 ul 1x Passive Lysis Buffer (Promega)
for 10 min. Immediately before luminescence detection,



20 ul of each lysate was mixed with 50 ul of Luciferase
Assay Reagent (Promega) in a fresh opaque-white 
96-well plate. Luminescence values were normalized for 
well-to-well variations in viable cell numbers by dividing 
by the corresponding (A450-A690 nm) values of the 
WST-1 containing media. 
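
The normalization described above amounts to dividing each well's luminescence by its background-corrected WST-1 absorbance. A minimal Python sketch follows; the function names and well values are hypothetical, and the final ratio of repeat-containing to repeat-free cells is included only as an assumed way of comparing the two clones.

```python
from statistics import mean


def normalized_luminescence(luminescence, a450, a690):
    """Luminescence divided by the viability signal (A450 - A690)."""
    viability = a450 - a690
    if viability <= 0:
        raise ValueError("non-positive WST-1 signal; well cannot be normalized")
    return luminescence / viability


def repeat_to_control_ratio(repeat_wells, control_wells):
    """Mean normalized signal of repeat-containing wells over control wells.

    Each well is a (luminescence, A450, A690) tuple.
    """
    repeat = mean([normalized_luminescence(*w) for w in repeat_wells])
    control = mean([normalized_luminescence(*w) for w in control_wells])
    return repeat / control


# Hypothetical triplicate wells at one compound concentration.
repeat_wells = [(600.0, 1.25, 0.25)] * 3
control_wells = [(1200.0, 1.25, 0.25)] * 3
ratio = repeat_to_control_ratio(repeat_wells, control_wells)
```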

Mouse studies 

Mouse handling and experimental procedures were con- 
ducted in accordance with the Association for Assessment 
and Accreditation of Laboratory Animal Care. HSA^LR
transgenic mice in line 20b expressing human skeletal
actin RNA with 220 CUG repeats in the 3'-UTR were 
previously described (while initially incorporating 250 
CUG repeats, these have subsequently shortened) (31). 
Initial experiments were conducted with age- and 
gender-matched HSA^LR mice (12-16 weeks old), while
follow-up experiments on compound 11 used age- 
matched mice, with four males and one female in each 
group. In each case, experimental animals were injected 
intraperitoneally with 40mg/kg of compounds or saline 
alone once per day for 5 days. Mice were sacrificed 1 
day after the last injection, and vastus muscle was 
obtained for splicing analysis. RNA extraction, cDNA 
preparation and RT-PCR were performed as described 
previously (31,32). Each RT-PCR reaction was performed 
in triplicate. The PCR products were separated on agarose 
gels, and scanned with a laser fluorimager (Typhoon, GE 
Healthcare). Quantitative analysis of amplified products 
was performed by ImageQuant software (Molecular 
Dynamics). Differences between two groups were 
evaluated by Student's t-test.
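
For reference, the two-group comparison can be sketched as below. The text does not state whether the test was paired or unpaired, or one- or two-tailed; this sketch assumes an unpaired, equal-variance (pooled) Student's t statistic, and the function name is hypothetical. A P value would then be read from the t distribution with the returned degrees of freedom using a statistics package.

```python
import math
from statistics import mean, variance


def students_t(sample_a, sample_b):
    """Pooled-variance t statistic and degrees of freedom for two groups."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(sample_a)
                  + (n_b - 1) * variance(sample_b)) / df
    std_err = math.sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    return (mean(sample_a) - mean(sample_b)) / std_err, df
```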

RESULTS 

The binding characteristics of 3-11 were evaluated against 
pathogenic CUG exp RNA and other RNA sequences by 
surface plasmon resonance (SPR). This technique allows 
the association and dissociation rates (k_a and k_d), equilibrium
binding constants (K_D) and the binding stoichiometry
(n) to be measured in a label-free format (33). RNA
sequences, shown in Figure 2 along with their respective 
abbreviated names, were designed in order to test the 
effect of increasing numbers of CUG repeats ['(CUG)2',
'(CUG)4' and '(CUG)10'], differences between recognition
of DM1 and DM2 RNA ['(CCUG)10'], U-U versus A-A
mismatches ['(CAG)10'] and mismatched RNA versus a
CUG-CAG duplex ('Duplex'). An RNA sequence 
derived from the gag/pol frameshift-stimulating region of 
HIV-1 ('HIV-1 FSS') (34) was employed as an off-target 
control. 

Equilibrium affinities 

Compound 3 showed no measurable affinity for (CUG)10.
In contrast, 4 and 5 bound (CUG)10 with apparent K_D
values of 39 and 68 nM, respectively, in both cases representing
a roughly 50-fold improvement in affinity relative
to 1 (6.7 uM, as measured by filter binding) (14).
Scatchard analysis of the steady-state SPR response
units (RU) for the binding between 4 or 5 and (CUG)10
Figure 2. Thermodynamic affinities (apparent K_D, nM) and selected stoichiometries (n) as measured by SPR. Binding constants are reported as an
average of two complete titrations determined at differing surface RNA density; measured K D values for each run and associated errors are provided 
in Supplementary Data. 



yielded a binding stoichiometry of 10:1 for each isomer. 
This implies that one molecule of compound binds one 
(CUG) trinucleotide in the (CUG) 10 RNA hairpin stem 
and loop. The Scatchard plot for these data displays a 
strong convex curvature (Supplementary Data), and Hill 
coefficients of 1.9 (compound 4) or 1.5 (compound 5) were 
calculated, all consistent with positively cooperative 
binding (35,36). Both the apparent affinity and binding 
stoichiometry (the latter as a function of number of 
CUG repeats) remained consistent for 4 binding to 
(CUG) 4 and (CUG) 2 , confirming that one molecule of 4 
binds to each CUG. In contrast, the affinity of 5 actually 
appeared to strengthen slightly for (CUG) 4 and (CUG) 2 . 
Reducing the number of CUG repeats in the target RNA 
yielded linear Scatchard plots for 4 (Supplementary 
Tables S3 and S4). This confirms an expected loss of 
cooperativity as the number of CUG repeats decreases. 
Likewise, calculated Hill coefficients were consistent with 
non-cooperative binding to short repeats. One possible 
explanation for this is that the longer (CUG)10 repeat
RNA has greater flexibility, and ligand binding induces 
a conformational change which then affects the mode of 



binding of subsequent ligand molecules. While 4 and 5 are 
strongly selective for binding CUG repeat RNAs relative 
to unrelated hairpins such as the HIV-1 FSS (apparent K_D
1240 and 759 nM, respectively), both (CCUG)10 and
(CAG)10 were bound with similar affinities to the CUG
repeat sequences. It is not surprising that differentiation 
among these sequences is particularly challenging. 
MBNL1 itself has been reported to bind CUG and 
CCUG repeats with only a 2-fold difference in affinity 
as measured by gel-shift assay (37). Likewise, MBNL1 
was found to bind (CUG) and (CAG) repeats with 
similar affinity in a filter binding assay (38). The X-ray 
crystal structures of a (CUG) 6 repeat duplex and a 
(CAG) repeat containing RNA duplex showed that both 
adopt A-form RNA conformation with alternating stripes 
of positive and negative potential due to the G-C pairs 
(39-41). It has been predicted that the (CCUG) repeat
RNA may also adopt a similar conformation (38). 
Binding of 4 to duplex CUG-CAG appears to be
~4.6-fold stronger than that of the E isomer, 5.
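
The Hill coefficients and Scatchard behaviour discussed in this section can be illustrated with a short Python sketch. It is not the analysis code used for the reported values; the function names are hypothetical and the data are synthetic. A Hill coefficient is estimated as the slope of log[θ/(1 - θ)] versus log C, where θ is fractional saturation. For identical, non-cooperative sites the Scatchard transform r/C_free versus r is a straight line with slope -1/K_D, and departures from linearity indicate cooperativity.

```python
import math


def hill_saturation(conc, k_half, n):
    """Fractional saturation from the Hill equation."""
    return conc ** n / (k_half ** n + conc ** n)


def hill_coefficient(concs, thetas):
    """Least-squares slope of log10(theta / (1 - theta)) versus log10(C)."""
    xs = [math.log10(c) for c in concs]
    ys = [math.log10(t / (1.0 - t)) for t in thetas]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den


def scatchard_points(bound_per_rna, free_concs):
    """(r, r / C_free) pairs for a Scatchard plot."""
    return [(r, r / c) for r, c in zip(bound_per_rna, free_concs)]


# Synthetic example: cooperative binding with n = 1.9 (illustrative only).
concs = [10.0, 20.0, 50.0, 100.0, 200.0]
thetas = [hill_saturation(c, 50.0, 1.9) for c in concs]
n_est = hill_coefficient(concs, thetas)
```

A slope near 1 in the Hill plot corresponds to the non-cooperative case, which is the behaviour described for the shorter repeats.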

To ascertain the importance of the benzo[g]quin- 
oline moiety, we examined cis (6) and trans (7) peptides 



lacking the heterocycle. While both compounds 
bound (CUG)10 RNA with modest affinity, neither
displayed selectivity for the CUG repeat. Likewise, 
benzo[g]quinoline-bearing compounds with a 'scrambled' 
peptide sequence (8 and 9) showed reduced affinity to 
(CUG) 10, and no selectivity for CUG or CCUG repeats 
relative to duplex CUG-CAG or HIV-1 FSS RNAs. Both 
observations suggest that a substantial fraction of the 
functionality in 4 and 5 must be present in the correct 
relative spatial orientation in order for high affinity rec- 
ognition to be achieved. 

A key aspect of the original RBDCC library design was 
that the disulfide was incorporated only to reversibly link 
RNA-binding modules together, rather than participating 
in RNA binding itself. Such a reversible linkage is a 
critical feature of any dynamic combinatorial library, 
but depending on the exchange reaction used may 
require subsequent re-engineering of the compound in a 
form not subject to exchange under physiological condi- 
tions. In this case, replacement of the labile disulfide 
bridge with an olefin allowed us to not only improve the 
biostability of the compound, but also provided an oppor- 
tunity to test the effect of varying the spacing between 
modules on binding. This was also motivated by the 
complete lack of binding by monomer 3 and benzo[g]quin- 
oline 2, suggesting that a bidentate mode of binding 
stabilized by the two heterocyclic handles is required. 
For this hypothesis to hold, one would anticipate that 
changing the separation between the two halves of the 
ligand would alter selectivity and affinity. Thus, we 
synthesized 'extended' linker compounds 10 and 11. 
These compounds were found to bind (CUG)10 with an
affinity similar to that displayed by 4 and 5; however,
the affinity of 11 decreased in tandem with decreasing
hairpin length [from an apparent K_D of 22.5 nM for
(CUG)10 to 668 nM for (CUG)2]. Binding stoichiometry
also changed, with 5:1 binding observed for (CUG)10 and
1:1 binding observed for (CUG)4. This more stringent
structural requirement also manifested as a dramatic
increase in sequence selectivity: 11 has a significant preference
for (CUG)10 over (CCUG)10, and a 38-fold preference
versus (CAG)10. No measurable binding was
observed by SPR to the duplex CUG-CAG sequence, 
while binding to the HIV hairpin was reduced 89-fold. 
As 10 was isolated as only a very minor product of the 
metathesis reaction, only a limited number of experiments 
were possible with this compound. However, SPR 
suggests it is also somewhat selective for (CUG)10 over
(CCUG)10. Likewise, 10 binds (CUG)10 with a 5:1 stoichiometry.
Binding of 11 to (CUG)10 is cooperative (Hill
coefficient of 1.9). The 2-fold difference in stoichiometry 
for 'extended' compounds 10 and 11 relative to 4 and 5 is 
readily observable in the SPR trace. For example, an in- 
jection of excess 10 produces roughly half the steady-state 
response of the injection of an equivalent concentration of 
5 (Supplementary Figure S2). These data are consistent 
with the requirement of a more distributed binding site. 
Neomycin, a well-studied aminoglycoside antibiotic with 
relatively low sequence selectivity, has been reported to 
bind (CUG) repeats (9), and represents a useful positive 
control from another structural class. We found that 

Table 1. Binding constants as measured by fluorescence titration

Compound   Sequence    Apparent K_D (nM)
3          (CUG)10     No binding
4          (CUG)10     66 ± 2
5          (CUG)10     70 ± 1
10         (CUG)10     40.4 ± 0.3
11         (CUG)10     57 ± 1
11         (CCUG)10    73.6 ± 0.8

Values are a composite fit of at least two complete titrations ± 1 SD on
the fit.



neomycin binds (CUG)10 and (CCUG)10 with much
weaker affinity (apparent K_D of 409 and 1955 nM, respectively)
than the best compounds described earlier.

Selected equilibrium affinities for (CUG)10 were confirmed
by fluorescence titration. The benzo[g]quinoline
moiety is fluorescent, with excitation and emission 
maxima at 362 and 439 nm, respectively, in methanol. 
The quantum yield of the ethylbenzo[g]quinoline 
carboxylic acid (2), relative to quinine sulfate, was 
determined to be 0.64 (Supplementary Figure S3). This 
unique feature allowed us to directly monitor RNA 
binding, and mammalian cell penetration and localization 
without the need for additional labeling. In all cases, 
compound binding resulted in saturable quenching of 
fluorescence (Supplementary Figures S4-S6). Apparent 
K D values measured in this manner are consistent with 
those obtained by SPR (Table 1); that they are not iden- 
tical to SPR-measured values likely results from the dif- 
fering formats of the two methods. As a control, yeast 
tRNA was titrated into a mixture of 4 and 5; no binding 
was observed. 

Analysis of binding kinetics 

While equilibrium binding constants are essential for 
understanding binding selectivity, they do not provide a 
complete picture of the recognition process. In the ligand- 
protein binding field, an emerging consensus holds that 
kinetic constants, and in particular the off-rate, are key 
determinants of selectivity (42-44). Similar observations
have been reported for DNA-binding compounds 
(45-48), but to date there is little information addressing 
this hypothesis in the context of RNA binding (and no 
information with regard to how this affects compound 
behavior in a cellular environment or in vivo, where com- 
pounds must compete with other cellular constituents for 
target RNAs). 

For SPR-derived kinetic constants, we observe that 
on-rates generally do not vary substantially for individual 
compounds across a series of RNAs. In contrast, off rates 
strongly reflect differences in selectivity, consistent 
with the above hypothesis. These data are perhaps easier 
to conceptualize in terms of dissociative half-life, or t_1/2,
for the interaction with target sequences (Table 2). 
Comparisons among compounds and target sequences 
based on dissociative half-life are a useful addition to 
commonly employed metrics such as the selectivity 
index, and may have some advantages since the former is
an absolute measurement while the latter is defined based 
on a changeable set of RNAs. For the compounds studied, 
11 displays the largest differences in t_1/2, consistent with its
highest selectivity for CUG repeats: an ~25-fold difference
between (CUG)10 and (CCUG)10, and an ~47-fold
difference between (CUG)10 and the HIV-1 FSS.

Prior to analyzing the ability of selected compounds to 
interfere with CUG exp mediated effects in model cell lines, 
it was crucial to first establish that they are capable of 
crossing the cell membrane, and are non-toxic at experi- 
mentally relevant concentrations. Cell permeability was 
assessed in human fibroblasts, and in mouse C2C12 myo- 
blasts. In both cases, penetration of the cell membrane was 
readily observed because of the fluorescence of the 
benzo[g]quinoline chromophore; compounds also appear 
to localize preferentially in the nucleus (Figure 3). Since 
the sequestration of MBNL1 by pathogenic CUG exp RNA 



Table 2. Residence time (t_1/2, s) for compound binding to RNA
[Table body not recovered. Surviving column headings: (CUG)2, (CUG)10, (CAG)10, Duplex, HIV-1. Surviving entries: 548, ND, ND.]

occurs entirely in the nucleus, this apparent selective
localization even in non CUG exp -containing cell lines is 
particularly notable. Of course, the mechanism of 
compound entry is not revealed by this analysis. 
Interestingly, benzo[g]quinoline 2 showed only a modest 
ability to enter cells, and 'scrambled' compound 9 was 
significantly less able to enter cells than 4, 5 or 11 
(Supplementary Figures S7-S10). Toxicity of 4 and 5 to 
mouse myoblasts was measured by MTT assay (49). No 
significant toxicity was observed at concentrations up to 
100 uM. In contrast, mitomycin C, a commonly employed 
DNA-targeted cancer chemotherapeutic agent, showed 
significant toxicity at all concentrations tested (Figure 4). 
Similar results were obtained in preliminary experiments 
with human fibroblasts (Supplementary Figure S11).
Benzo[g]quinoline 2 was also tested, and found to have 
no toxicity in fibroblasts at concentrations up to 500 uM 
(Supplementary Figure S10), an expected result given its 
limited ability to cross the cell membrane. 

We then evaluated the ability of selected compounds to 
inhibit the nuclear retention of CUG exp RNA, a phenom- 
enon that depends on formation of RNA-MBNL1 
complexes, in mouse myoblasts. An engineered cell line 
(C5-14) expressing a luciferase-encoding mRNA with 
800 CUG repeats in the 3'-UTR was used. Formation of 
(CUG)800 RNA-MBNL1 complexes in the C5-14 cell line
suppresses translation of this mRNA, and therefore
suppresses the cellular level of luciferase. Inhibition of
protein complexation to the (CUG)800 RNA allows translation
to occur, restoring luciferase expression. A second
cell line (Cl-S) carrying an analogous luciferase construct 
lacking the CUG repeats in the 3'-UTR was used as a 
positive control for luciferase expression. After treatment 
with a morpholino antisense oligonucleotide (CAG-25) 
Figure 3. Compounds readily penetrate cell membranes and localize in the nucleus. Representative bright field (lower) and fluorescence (upper)
images are shown for compound 5 in mouse myoblasts. 
Figure 4. MTT assays indicate low toxicity of 4 and 5 in mouse
myoblasts. The significantly higher toxicity of mitomycin C is shown 
for comparison. Error bars indicate single SDs of three replicates of 
each concentration. 



complementary to the CUG repeat RNA, C5-14 cells 
showed an increase in luciferase expression consistent 
with direct interaction with the (CUG)800 mRNA in the
nucleus (J. Hoskins et al., manuscript in preparation). The 
increase in luciferase activity is accompanied by a disrup- 
tion of CUG foci, as seen by fluorescence in-situ hybrid- 
ization (FISH). We hypothesized that binding of our 
compounds to the (CUG)800 repeat in C5-14 cells would
likewise promote release of the luciferase transcripts from 
the nucleus, resulting in increased luciferase expression 
and activity. In contrast, luciferase activity in positive 
control (Cl-S) cells should not be affected by the 
presence of CUG RNA binding compounds. Indeed, 
incubation of various concentrations of 4, 10, or 
11 (0-100 uM) with C5-14 myoblasts resulted in 
concentration-dependent increases in luciferase activity, 
consistent with our hypothesis (Figure 5). Compound 7 
(lacking selectivity to CUG exp RNA) had no effect on 
luciferase activity when incubated with C5-14 cells under 
the same conditions. No statistically significant change in 
luciferase signal was observed for Cl-S cells (which lack 
CUG repeats) treated with compounds 10 or 11; however, 
we observed a concentration-dependent decrease in 
luciferase activity when these cells were incubated with 
4. Since the WST-1 viability results showed no observable
toxicity of 4 to the Cl-S cells at the assay concentrations, 
we speculated that this effect could result from an off- 
target effect of compound 4 causing a global decrease in 
protein synthesis. The concentration of total protein in 
cell lysates of both C5-14 and Cl-S cells, as measured 
by Bradford assay (Supplementary Figures S12 and 
S13), indeed showed a decreasing level of protein expression
with increasing concentration of 4. The fact that this
effect is not observed for 10 and 11 likely results from 
their greater selectivity for CUG repeats. Thus, CUG exp 
binding compounds are capable of releasing the seq- 
uestered mRNA transcripts into the cytoplasm for 
translation. 
Figure 5. Compounds are able to promote expression of a (CUG)800-containing
luciferase construct in mouse myoblasts. Firefly luciferase
activity is plotted as a ratio of the normalized luminescence from
cells containing (CUG)800 in the 3'-UTR of the luciferase mRNA
(C5-14) to normalized luminescence of cells containing no CUG 
repeats (Cl-S). Error bars indicate 1 SD of luminescence from the 
average of three replicate wells. 
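
The ratio plotted in Figure 5 can be illustrated with a short calculation. The sketch below is only an assumption about how such a ratio might be computed: the text does not state how luminescence was normalized, so normalization to vehicle-treated wells is hypothetical, as are the function names and the propagation of the two standard deviations into a single error on the ratio.

```python
from math import sqrt
from statistics import mean, stdev


def normalize(wells, vehicle_wells):
    """Scale raw luminescence by the mean of vehicle-treated wells.

    Assumption: the normalization scheme is not given in the text.
    """
    baseline = mean(vehicle_wells)
    return [w / baseline for w in wells]


def activity_ratio(c5_14, c1_s):
    """Return (ratio, error) for normalized C5-14 over normalized C1-S.

    The error combines the relative standard deviations of both cell
    lines in quadrature, which is an illustrative choice and not
    necessarily how the published error bars were derived.
    """
    m_rep, m_ctl = mean(c5_14), mean(c1_s)
    ratio = m_rep / m_ctl
    rel = sqrt((stdev(c5_14) / m_rep) ** 2 + (stdev(c1_s) / m_ctl) ** 2)
    return ratio, ratio * rel


if __name__ == "__main__":
    # Hypothetical raw luminescence readings from three replicate wells.
    treated = normalize([2100.0, 1900.0, 2000.0], [1000.0, 1000.0, 1000.0])
    control = normalize([510.0, 490.0, 500.0], [500.0, 500.0, 500.0])
    print(activity_ratio(treated, control))
```

A ratio above 1 at a given compound concentration would correspond to increased luciferase activity in the repeat-containing cells relative to the repeat-free cells.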



Compounds partially restore splicing in a mouse model 
of DM1 

Of course, a critical test of a compound's performance is 
its ability to function in vivo. DM1 has several well-studied 
mouse models (31). We examined the activity of compounds 
4, 9 and 11 in the HSALR mouse model, in 
which transgenic mice carry a long CTG repeat inserted 
into the human skeletal actin (HSA) gene in skeletal 
muscle (50). These mice exhibit several DM1-like phenotypic 
characteristics. They also display misregulated alternative 
splicing of genes, including Clcn1 and Atp2a1, 
which are physiological targets of MBNL1 regulation. 
Not all compounds were tested, owing both to the complexity 
and cost of mouse studies and to the limited availability 
of compound 10. 
To examine effects on MBNL1-regulated splicing events, 
we administered compounds 4, 9 or 11 to HSALR transgenic 
mice by daily intraperitoneal injection of 40 mg/kg 
for 5 days. In the initial experiment, analysis of splicing 
in hind limb (quadriceps) muscle showed that compound 
4 produced a modest but statistically significant 
(P = 0.0379) improvement of Clcn1 splicing relative to 
saline-injected controls, whereas compound 9 did not 
(Figure 6). Statistically significant improvements in both 
Clcn1 and Atp2a1 splicing (P = 0.0419 and P = 0.0223, 
respectively; Supplementary Figure S14) were observed following 
treatment with compound 11. To confirm these results, we 
carried out a further study in which five age-matched mice 
were treated with compound 11 via intraperitoneal injection 
at 40 mg/kg once daily for 5 days, and compared with 
five age-matched controls. Once again, 11 provided a 
statistically significant improvement in Atp2a1 and Clcn1 
splicing. Ttn, another transcript dependent on MBNL1-mediated 
splicing, also showed improvement (Figure 7). 
Figure 6. Compound 4 improves MBNL1-dependent splicing in vivo. 
Asterisk (*) indicates statistical significance (P = 0.0379); the dashed 
line is provided as a reference to splicing levels for HSALR mice 
(n = 3 for experimental, control and wild-type) in the absence of 
compound (control). The error bars indicate SEM. 



Splicing of Capzb, a developmentally regulated exon 
whose splicing does not depend on MBNL1, was not 
affected. 
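
The splicing comparisons above report group means with SEM error bars. As a minimal sketch of that arithmetic, and assuming (the text here does not say) that splicing is quantified as the percentage of transcripts including the alternative exon, measured from the intensities of the inclusion and exclusion products, the per-group summary could be computed as follows. The function names and the example values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def percent_inclusion(inclusion_band, exclusion_band):
    """Percentage of product that includes the alternative exon."""
    total = inclusion_band + exclusion_band
    if total <= 0:
        raise ValueError("band intensities must sum to a positive value")
    return 100.0 * inclusion_band / total


def mean_sem(values):
    """Mean and standard error of the mean (sample SD / sqrt(n))."""
    if len(values) < 2:
        raise ValueError("at least two animals are needed for an SEM")
    return mean(values), stdev(values) / sqrt(len(values))


if __name__ == "__main__":
    # Hypothetical band intensities (inclusion, exclusion) for three mice.
    treated = [percent_inclusion(i, e) for i, e in [(30, 70), (35, 65), (28, 72)]]
    print(mean_sem(treated))
```

The significance values quoted in the text come from the authors' own statistical analysis; this sketch covers only the descriptive summary and does not reproduce any test.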



DISCUSSION 

Using a moderate-affinity ligand identified from a 
resin-bound dynamic combinatorial library as a starting 
point, we have successfully developed next-generation lead 
compounds able to selectively bind DM1 and DM2 RNA 
with high affinity. These compounds represent to our 
knowledge the first use of the benzo[g]quinoline moiety 
in an RNA-binding context, an important advance in 
that this substructure allows for direct visualization of 
compounds in cells via fluorescence. The strong selectivity 
of 11 for CUG repeats is particularly notable, as is its 
enhanced affinity for longer CUG repeat sequences. 
Selectivity for longer repeats is potentially highly advan- 
tageous: since isolated and short CUG repeat sequences 
are found throughout the transcriptome, binders must dif- 
ferentiate between these and longer repeats in order to 
avoid off-target effects in vivo. 

As stated at the outset, our inclusion of the 
benzo[g]quinoline moiety in second-generation com- 
pounds was predicated on the assumption that binding 
of lead compounds to CUG repeats involved intercalation 
of the quinoline moieties. Therefore, the stoichiometries 
for 4 and 5 binding these repeats as determined by SPR 
were surprising. In each case, binding of 1 molecule of 4 or 
5 per CUG via a mode in which both benzo[g]quinolines 
intercalate would require violation of the neighbor- 
exclusion rule (51). Further studies will be essential to 
fully understanding the structural parameters of binding 
by these two compounds, as well as the others described in 
this article. 

Figure 7. Compound 11 improves MBNL1-dependent splicing in vivo. 
Asterisks (*) indicate statistical significance (P < 0.005); the dashed line 
is provided as a reference to splicing levels for HSALR mice in the 
absence of compound (n = 5 for compound-treated and control). The 
level of an MBNL1-independent splicing event (Capzb) is unaffected by 
11. Error bars indicate SEM. 

Several of the molecules synthesized show low toxicity 
in human fibroblast and mouse myoblast cell lines, 
suggesting the high sequence selectivity displayed by these 
structures is transferable to a biological context. 
Compounds examined in this work are able to release the 
nuclear retention of a CUG repeat-containing transcript 
in a concentration-dependent manner in mouse myoblasts, 
and improve splicing of MBNL1-dependent targets in a 
mouse model of DM1. Thus, we have demonstrated for 
the first time that DCC can serve as a starting point from 
which to develop high-affinity sequence-selective RNA 
binding compounds with desirable biological activity 
in vitro (in cell culture) and in vivo (in mice). Activity 
in vivo was observed to correlate with in vitro potency. 
Interestingly, we also observed that mice treated with 4 
exhibited significant acute toxicity effects while those 
treated with 11 did not. While the higher (CUGexp) 
sequence selectivity of 11 may contribute to its lower 
toxicity, further studies will be essential to understand 
these differences in detail. The amount of splicing 
improvement observed after daily intraperitoneal injection 
of 4 or 11 at 40 mg/kg for 5 days is 
similar to that produced by pentamidine (8) at a dosage 
of 40 mg/kg once daily for 7 days, but we anticipate that the 
greater selectivity of the compounds we describe 
(particularly 11) may provide a more favorable pathway 
for future development. 

In order to build on these initial successes, it will be 
necessary to increase the amount of compound-dependent 
splicing improvement observed in mouse models of DM1. 
It is possible that this goal can be at least partially accom- 
plished by modification of the dosage and/or dosing 
schedule. Further improvement of RNA binding affinity, 
bioavailability and stability will also likely prove essential. 
Medicinal chemistry efforts based on compound 11 along 
those lines are currently in progress, as we continue to 
make strides toward the development of a viable 
anti-DM1 therapy. 

SUPPLEMENTARY DATA 

Supplementary Data are available at NAR Online: Sup- 
plementary Tables 1—44, Supplementary Figures 1-14 and 
Supplementary Methods. 